Acoustic-phonetic labels in a Japanese speech database
Abstract
A large Japanese speech database at ATR (JSDB-ATR) is introduced. The speech data are transcribed in multiple ways using acoustic-phonetic symbols, to serve various data access requests and to make fine acoustic-phonetic analysis convenient. For the multiple transcriptions, three types of categories are considered: linguistic and phonemic categories, acoustic event categories, and some allophonic variation categories. To date, about 8500 words uttered by each of eight professional announcers have been collected, and half of them have been acoustically-phonetically transcribed.
INTRODUCTION
Recently, the construction of speech databases has been undertaken in many languages to obtain knowledge for speech recognition, perception and synthesis [1]-[3]. However, there are few Japanese speech databases (JSDB) large enough for various research purposes. In this paper, a large Japanese speech database that is being built at ATR (JSDB-ATR) is introduced, focusing on its multiple acoustic-phonetic transcriptions.
Similar resources
Developing a Chinese L2 speech database of Japanese learners with narrow-phonetic labels for computer assisted pronunciation training
For the purpose of developing Computer Assisted Pronunciation Training (CAPT) technology with more informative feedback, we propose to use a set of narrow-phonetic labels to annotate a Chinese L2 speech database of Japanese learners. The labels include basic units of “Initials” and “Finals” for Chinese phonemes and diacritics for erroneous articulation tendencies. Pilot investigations were made on t...
A linguistic and prosodic database for data-driven Japanese TTS synthesis
We propose a method to generate a database that contains a parametric representation of F0 contours associated with linguistic and acoustic information, to be used by data-driven Japanese text-to-speech (TTS) systems. The configuration of the database includes recorded speech, F0 contours and their parametric labels, phonetic transcription with durations, and other linguistic information such a...
An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition
Convolutional Neural Networks (CNNs) have demonstrated their performance in speech recognition systems for feature extraction and for acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. A Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...
Corpus of Spontaneous Japanese: Its Design and Evaluation
The Corpus of Spontaneous Japanese, or CSJ, is a large-scale database of spontaneous Japanese. It contains speech signals and transcriptions of about 7 million words, along with various annotations such as POS and phonetic labels. After describing its design issues, a preliminary evaluation of the CSJ is presented. The results strongly suggest the usefulness of the CSJ as the resource for the study of spo...
Use of a Large-scale Spontaneous Speech Corpus in the Study of Linguistic Variation
The Corpus of Spontaneous Japanese, or CSJ, is a large-scale database of spontaneous Japanese. It contains speech signals and transcriptions of about 7 million words, along with various annotations such as POS and phonetic labels. After describing its design issues, the potential of the CSJ as a resource for the study of linguistic variation is evaluated.